Nature is full of random networks of complex topology describing such apparently disparate systems as biological, economic or information systems. Their most characteristic feature is the apparent scale-free character of interconnections between nodes. Using an information theory approach, we show that maximization of information entropy leads to a wide spectrum of possible types of distributions including, in the case of nonextensive information entropy, the power-like scale-free distributions characteristic of complex systems.

PACS numbers: 89.75.-k, 89.70.+c, 24.60.-k

Keywords: Complex networks, Information theory, Nonextensive statistics 



Random networks have recently found applications in the description of complex systems in different, apparently very disparate branches of modern science such as molecular biology, sociology, economics and computer science. For example, living organisms form huge genetic networks whose nodes are proteins and whose links represent the corresponding chemical interactions jSJ. A similarly big network is formed by the nervous system, whose nodes are connected by axons flj. Comparable complexity is shown by networks existing in sociological systems, in which nodes are countries, organizations or single persons whereas links characterize their mutual interactions 0, in the world of finance, and in computer networks (with the World Wide Web being the best-known example, where nodes are HTML documents connected via hyperlinks (URLs) [Hj). For the most recent reviews of random networks see [7] [H].

Analysis of different random networks clearly indicates that the probability $P(k)$ that a given node is joined with $k$ other nodes is described by the power law $P(k) \propto k^{-\gamma}$ [S]. For example, the most convincing analyses of computer networks with over 800 million nodes [10, H3, 11, 12] lead to a power-like distribution $P(k)$ with exponent $\gamma$ between 2.1 and 2.45. This contradicts the existing models of random networks [13, 14], which predict instead exponential distributions, $P(k) \propto \exp(-k)$. The most popular model (ER), dealing with a fixed number of nodes $N$, was proposed in [13], where the Poisson distribution was advocated for the probability that a given node has $k$ links (with the mean number of connections being $\lambda_0$),

$$P(k) = \frac{\lambda_0^{k}}{k!}\, e^{-\lambda_0}. \qquad (1)$$

However, in order to get the observed power-like form of $P(k)$ one has to allow for growing $N$ and replace the democratic law of attaching a new link, used in deriving (1), by a preferential one. This means that the distribution $P(k)$ is determined by the dynamics of the growth of the network [7, 11]. Starting from a small number $m_0$ of nodes, adding in each time step a new node with $m < m_0$ possible connections, and assuming that this new node joins the already existing nodes with equal, $k_i$-independent, probability $\Pi(k_i) = 1/(m_0 + t - 1)$, the evolution (growth) of the network is described by the following equation [11]:

$$\frac{\partial k_i}{\partial t} = m\,\Pi(k_i) = \frac{m}{m_0 + t - 1}, \qquad (2)$$

leading for long times $t$ to an exponential stationary distribution:

$$P(k) = \frac{e}{m}\exp\left(-\frac{k}{m}\right). \qquad (3)$$

1 Notice that in the limit of large $k$ distribution (1) can be approximated by $P(k) \simeq (2\pi k)^{-1/2} (\lambda_0/k)^{k} \exp(k - \lambda_0)$, whereas for large values of $\lambda_0$ it becomes a Gaussian distribution, $P(k) = (2\pi\lambda_0)^{-1/2} \exp\left[-(k - \lambda_0)^2/2\lambda_0\right]$.

On the other hand, assuming that the probability $\Pi(k_i)$ is selective, for example that $\Pi(k_i) = k_i / \sum_{j=1}^{m_0+t-1} k_j = k_i/(2mt)$, one gets instead asymptotically a simple power law for $P(k)$:

$$P(k) = \frac{2m^2 t}{m_0 + t}\, k^{-3} \propto k^{-3}. \qquad (4)$$
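The contrast between the two attachment rules can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not the procedure of the cited models: the function name `grow_network`, the ring used as the initial graph, the requirement that the `m` targets of a new node be distinct, and all parameter values are choices made here.

```python
import random

def grow_network(n_steps, m=2, m0=3, preferential=True, seed=0):
    """Grow a network by adding one node with m links per time step.

    Returns the list of node degrees after n_steps additions.
    """
    rng = random.Random(seed)
    degree = [0] * m0
    stubs = []  # every link contributes both of its end nodes to this list
    # Assumed initial graph: a ring of the m0 starting nodes.
    for i in range(m0):
        j = (i + 1) % m0
        degree[i] += 1
        degree[j] += 1
        stubs += [i, j]
    for _ in range(n_steps):
        new = len(degree)
        targets = set()
        while len(targets) < m:
            if preferential:
                # A uniform draw from stubs picks node i with probability k_i / sum_j k_j.
                targets.add(rng.choice(stubs))
            else:
                # "Democratic" rule: every existing node is equally likely.
                targets.add(rng.randrange(new))
        degree.append(0)
        for t in sorted(targets):
            degree[t] += 1
            degree[new] += 1
            stubs += [t, new]
    return degree

uniform = grow_network(5000, preferential=False)
selective = grow_network(5000, preferential=True)
print("largest degree, uniform rule:  ", max(uniform))
print("largest degree, selective rule:", max(selective))
```

With the selective rule a few early nodes collect a large number of links, which is the heavy tail expressed by eq. (4); with the uniform rule the largest degree grows only slowly with the size of the network.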

Apparently such a distribution with the universal exponent $\gamma = 3$ shows up in different situations (and under different names). As the Pareto distribution ^3] it describes the growth of the wealth of persons living in stable economic systems, as Zipf's law ^D] it is applied in linguistics, and it also describes the distribution of the citations of scientific works [17, 18].

In the limit of large $t$, i.e., when the stationary state has already developed, this problem can also be studied from the information theory point of view. Here one asks the following question [T^]: what is the informational content of data represented by the distribution $P(k)$? In other words, what is the minimal number of parameters needed to reproduce a given shape of $P(k)$? The question asked in such an approach is: suppose that we know only that a network we are interested in, which we would like to describe by $P(k)$, leads to some mean value of $k$, i.e., we know that

$$\sum_{k=1}^{\infty} P(k) = 1 \quad\text{and}\quad \langle k\rangle = \sum_{k=1}^{\infty} k\,P(k) = \lambda_0 = \text{const}; \qquad (5)$$

what would then be the most probable and least biased form of $P(k)$ in such a case (i.e., describing the existing data and given entirely in terms of $\lambda_0$)? To answer this question one maximizes the corresponding information entropy associated with the probability distribution $P(k)$ under the constraints (5). The usual form of such entropy is the Shannon entropy [20],

$$S = -\sum_{k=1}^{\infty} P(k)\ln P(k). \qquad (6)$$

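The conditions (5), representing our a priori knowledge of the problem, lead to the exponential probability distribution

$$P(k) = \frac{1}{\lambda_0}\exp\left(-\frac{k}{\lambda_0}\right), \qquad (7)$$

closely resembling eq. (3) (see footnote 2).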
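The maximum-entropy statement can be checked numerically. The following sketch assumes a truncated support $k = 1, \dots, k_{max}$, finds by bisection the exponential-type distribution with a prescribed mean, and compares its Shannon entropy with that of a slightly perturbed distribution satisfying the same two constraints (5). All function names, the target mean and the size of the perturbation are illustrative choices.

```python
import math

def shannon_entropy(p):
    """Shannon entropy S = -sum_k P(k) ln P(k), eq. (6)."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def mean_k(p):
    """Mean of k for a distribution stored as p[0] = P(1), p[1] = P(2), ..."""
    return sum(k * x for k, x in enumerate(p, start=1))

def exponential_pmf(beta, kmax):
    """Normalized P(k) proportional to exp(-beta * k) on k = 1..kmax."""
    w = [math.exp(-beta * k) for k in range(1, kmax + 1)]
    z = sum(w)
    return [x / z for x in w]

def solve_beta(target_mean, kmax=2000):
    """Bisect for the beta whose exponential distribution has the target mean."""
    lo, hi = 1e-6, 50.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mean_k(exponential_pmf(mid, kmax)) > target_mean:
            lo = mid  # mean too large, so the decay must be faster
        else:
            hi = mid
    return 0.5 * (lo + hi)

target = 5.46
p_exp = exponential_pmf(solve_beta(target), kmax=2000)

# Perturb P(1), P(2), P(3) by eps * (1, -2, 1): the sum and the mean stay unchanged.
eps = 0.01
p_other = list(p_exp)
p_other[0] += eps
p_other[1] -= 2 * eps
p_other[2] += eps

print("entropy of exponential form:", shannon_entropy(p_exp))
print("entropy after perturbation: ", shannon_entropy(p_other))
```

Any admissible perturbation lowers the entropy, which is what singles out the exponential form once only the normalization and the mean are known.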



2 It is interesting to mention that the additional knowledge that all entities represented by $k$ are indistinguishable, which results in the necessity of introducing in the corresponding summations the additional weight factor $1/k!$, changes the above distribution to the Poisson distribution of the ER model mentioned above, cf. eq. (1).
There are, however, many systems with properties preventing the use of the Shannon type of information entropy and calling for its generalization. For example, they possess some intrinsic fluctuations resulting in a whole spectrum of the parameter $\lambda$, $P(\lambda)$, with $\lambda_0 = \langle\lambda\rangle$ being only its mean value [21], or they develop some correlations introducing memory effects, cf. [22] (in statistical physics such a situation leads to the necessity of departing from the usual Boltzmann-Gibbs statistics in favour of some sort of generalized one HU). Out of many possible generalizations we shall use in this note the nonextensive Tsallis entropy defined as [22]:

$$S_q = -\frac{1}{1-q}\left(1 - \sum_{k}\left[P(k)\right]^{q}\right). \qquad (8)$$

It can be regarded as a minimal (i.e., one-parameter) extension of the Shannon entropy (6), to which it reduces when $q \to 1$. The parameter $q$ therefore describes summarily all effects preventing the use of Shannon entropy mentioned above.
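The reduction of $S_q$ to the Shannon entropy for $q \to 1$ is easy to verify numerically. The sketch below implements eq. (8) directly; the three-point test distribution and the function names are arbitrary illustrative choices.

```python
import math

def shannon_entropy(p):
    """Shannon entropy S = -sum_k P(k) ln P(k), eq. (6)."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def tsallis_entropy(p, q):
    """S_q = -(1 - sum_k P(k)^q) / (1 - q), eq. (8); Shannon entropy at q = 1."""
    if q == 1.0:
        return shannon_entropy(p)
    return -(1.0 - sum(x ** q for x in p if x > 0.0)) / (1.0 - q)

p = [0.5, 0.3, 0.2]
for q in (2.0, 1.5, 1.1, 1.01, 1.001):
    print(q, tsallis_entropy(p, q))
print("Shannon:", shannon_entropy(p))
```

The printed values approach the Shannon entropy as $q$ approaches 1 from either side.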



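Using $S_q$ as a measure of information about our system, i.e., maximizing $S_q$ with constraints (equivalent to (5) above)

$$\sum_{k} P(k) = 1 \quad\text{and}\quad \langle k\rangle_q = \frac{\sum_{k} k\left[P(k)\right]^{q}}{\sum_{k}\left[P(k)\right]^{q}} = \lambda_0 = \text{const}, \qquad (9)$$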



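one obtains as a result a power-like distribution of the form

$$P(k) = P_q(k) = C\left[1 - (1-q)\frac{k}{\lambda_0}\right]^{\frac{q}{1-q}}, \qquad (10)$$

where $C = 1/\sum_{k=1}^{\infty}\left[1 - (1-q)k/\lambda_0\right]^{q/(1-q)} = 1/\lambda_0$ is the normalization (see footnote 3). In this case

$$\langle k\rangle = \frac{\lambda_0}{2-q} \quad\text{and}\quad \mathrm{Var}(k) = \frac{\lambda_0^{2}}{(3-2q)(2-q)^{2}}. \qquad (11)$$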
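The normalization $C = 1/\lambda_0$ and the two moments can be cross-checked by direct numerical integration. The sketch below treats $k$ as a continuous variable on $[0, \infty)$, which is an assumption made here (it is the approximation in which $C = 1/\lambda_0$ holds exactly). The values $q = 1.2$ and $\lambda_0 = 2$ are arbitrary, chosen with $q < 3/2$ so that the variance is finite.

```python
import math

def p_q(k, q, lam0):
    """Eq. (10) with C = 1/lam0: (1/lam0) * [1 - (1 - q) k / lam0]^(q / (1 - q))."""
    return (1.0 / lam0) * (1.0 - (1.0 - q) * k / lam0) ** (q / (1.0 - q))

def moment(order, q, lam0, k_max=1e8, n=100_000):
    """Integral of k^order * P_q(k) over 0 <= k <= k_max.

    Uses the midpoint rule in the variable u = ln(1 + k), so that the
    long power-like tail is sampled on a logarithmic grid.
    """
    du = math.log1p(k_max) / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        k = math.expm1(u)
        total += k ** order * p_q(k, q, lam0) * math.exp(u) * du
    return total

q, lam0 = 1.2, 2.0
norm = moment(0, q, lam0)
mean = moment(1, q, lam0)
var = moment(2, q, lam0) - mean ** 2

print("normalization:", norm)
print("mean:", mean, "expected:", lam0 / (2.0 - q))
print("variance:", var, "expected:", lam0 ** 2 / ((3.0 - 2.0 * q) * (2.0 - q) ** 2))
```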

Notice that for $q \to 1$ this distribution becomes exponential, as in eq. (7). On the other hand, for large values of $k$, $k \gg \lambda_0/(q-1)$, it becomes a power-like distribution of the form

$$P_q(k) \propto k^{-\gamma} \quad\text{with}\quad \gamma = \frac{q}{q-1}, \qquad (12)$$

i.e., our distribution becomes in this limit a scale-free one. It is easy to check that if we demand that $\langle k\rangle < \infty$ then $q < 2$. It is interesting to note at this point that $\gamma = 3$ in eq. (4) corresponds precisely to $q = 3/2$, at which the variance $\mathrm{Var}(k)$ diverges.

3 It should be stressed that maximization of entropy provides us in this case only with the shape of the distribution $P(k)$, eq. (10), and gives no information on the particular values of the parameters $\lambda_0$ and $q$. Only knowledge of the moments $\langle k\rangle$ and $\mathrm{Var}(k)$ of $P_q(k)$, as given by eq. (11), allows for determining these two parameters.
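The crossover from the exponential-like region to the scale-free tail can be made visible by evaluating the local logarithmic slope of $P_q(k)$. The following sketch assumes $q > 1$ and $C = 1/\lambda_0$; the function names and the values $q = 3/2$, $\lambda_0 = 2$ are illustrative.

```python
import math

def log_p_q(k, q, lam0):
    """ln P_q(k) for eq. (10) with C = 1/lam0 (valid for q > 1, k >= 0)."""
    return -math.log(lam0) + (q / (1.0 - q)) * math.log1p((q - 1.0) * k / lam0)

def local_exponent(k, q, lam0, rel_step=1e-3):
    """Finite-difference estimate of -d ln P_q / d ln k around k."""
    k_lo, k_hi = k * (1.0 - rel_step), k * (1.0 + rel_step)
    slope = (log_p_q(k_hi, q, lam0) - log_p_q(k_lo, q, lam0)) / (
        math.log(k_hi) - math.log(k_lo)
    )
    return -slope

q, lam0 = 1.5, 2.0
crossover = lam0 / (q - 1.0)  # scale above which the power law sets in
for k in (0.1 * crossover, crossover, 10.0 * crossover, 1e4 * crossover):
    print(k, local_exponent(k, q, lam0))
```

The local exponent rises from small values at $k \ll \lambda_0/(q-1)$ to $\gamma = q/(q-1) = 3$ deep in the tail, passing through $\gamma/2$ at the crossover scale itself.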




Fig. 1. The probability distribution of connections, as a function of $k$, in the WWW network after ^21 (full squares). The full line shows the result of our fit using eq. (10) with $q = 1.65$ and $\lambda_0 = 1.91$. It reproduces the observed mean $\langle k\rangle = \lambda_0/(2-q) = 5.46$ and leads to the asymptotic power-like distribution $\propto k^{-\gamma}$ with $\gamma = q/(q-1) = 2.54$ (shown as a dotted line).

In Fig. 1 we show (as an example) the distribution of the number of connections in the WWW network with 325729 nodes and the mean number of connections equal to $\langle k\rangle = 5.46$, fitted by $P_q(k)$ as given by eq. (10) with parameters $\lambda_0 = 1.91$ and $q = 1.65$ (see footnote 4). Notice that eq. (10) describes the whole range of $k$, whereas the purely power-like distribution $\propto k^{-\gamma}$ with $\gamma = q/(q-1) = 2.54$ occurs only for large values of $k$. In the spirit of information theory this result can therefore be interpreted in the following way: (a) the system forming the network described in Fig. 1 possesses some features (mentioned above) preventing the use of Shannon information entropy and (b) data represented by $P(k)$ can be quite adequately described in terms of only two parameters: the Tsallis entropy parameter $q$ and the mean number of links $\langle k\rangle_q = \lambda_0$, i.e., their informational capacity is rather limited.

4 Notice that although in the fitting procedure both parameters were varied independently, they are connected via the distribution moments $\langle k\rangle$ and $\mathrm{Var}(k)$, cf. eq. (11).
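The numbers quoted for the fit follow from the fitted parameters by simple arithmetic, as the sketch below shows. The helper names are illustrative; only the relations $\langle k\rangle = \lambda_0/(2-q)$ and $\gamma = q/(q-1)$ and the fitted values $q = 1.65$, $\lambda_0 = 1.91$ are taken from the text.

```python
def mean_degree(q, lam0):
    """<k> = lam0 / (2 - q), eq. (11); finite only for q < 2."""
    return lam0 / (2.0 - q)

def tail_exponent(q):
    """gamma = q / (q - 1), eq. (12); defined for q > 1."""
    return q / (q - 1.0)

def parameters_from(mean, gamma):
    """Invert the two relations above: returns (q, lam0)."""
    q = gamma / (gamma - 1.0)
    return q, mean * (2.0 - q)

q, lam0 = 1.65, 1.91
print(round(mean_degree(q, lam0), 2))  # 5.46
print(round(tail_exponent(q), 2))      # 2.54
print(lam0 / (q - 1.0))                # scale of k above which the power law dominates
```

Since the fitted $q$ exceeds $3/2$, the variance in eq. (11) does not exist for this data set, so here the pair $(q, \lambda_0)$ is tied to the mean and the tail exponent rather than to the first two moments.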

The question now is: what is the physical meaning of the parameter $q$ in the context of stochastic networks? There is a long list of possible origins of $q \neq 1$ to be found in the literature dealing with nonextensivity [22, 21]. Out of these we shall only mention two: fluctuations and correlations. In [21] it was demonstrated that $q$ reflects fluctuations of the parameter $\lambda_0$ in the exponential distribution (7) above. In fact, it turns out that $(q-1)/q = \pm\mathrm{Var}(1/\lambda)/\langle 1/\lambda\rangle^{2}$. As is known from other branches of physics where Tsallis statistics is applied, the appearance of $q$ can also be caused by some correlations existing in the system under consideration. Such correlations (resulting, for example, from preferential attachment and the "rich-get-richer" phenomenon) seem to play a decisive role in the description of stochastic networks. Therefore, when choosing vertices with connectivity $k$ to which a new vertex is going to be connected, we shall assume that it will do so with a probability that depends on the connectivity $k$. To illustrate this point let us introduce in the evolution equation

$$\frac{dP(k)}{dk} = -\frac{1}{\lambda}\,P(k) \qquad (13)$$

the parameter $\lambda = \lambda(k)$ given by a simple linear function of $k$:

$$\lambda(k) = \frac{1}{q}\left[\lambda_0 + (q-1)k\right]. \qquad (14)$$

It is easy to see that in this case one gets immediately $P_q(k)$ in the form of eq. (10). Notice that for $q \to 1$ (i.e., for $\lambda \to \lambda_0$) one recovers the exponential distribution (see footnote 5).
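This step can be checked by integrating the evolution equation numerically. The sketch below uses a hand-written classical fourth-order Runge-Kutta scheme for $dP/dk = -P/\lambda(k)$ with $\lambda(k) = [\lambda_0 + (q-1)k]/q$ and compares the result with the closed form $(1/\lambda_0)[1 - (1-q)k/\lambda_0]^{q/(1-q)}$. The initial value $P(0) = 1/\lambda_0$, the step count and the parameter values are assumptions made for the illustration.

```python
def lam(k, q, lam0):
    """Linear lambda(k) = [lam0 + (q - 1) k] / q."""
    return (lam0 + (q - 1.0) * k) / q

def integrate_evolution(k_end, q, lam0, steps=2000):
    """Integrate dP/dk = -P / lambda(k) from k = 0 with fourth-order Runge-Kutta."""
    h = k_end / steps
    k, p = 0.0, 1.0 / lam0  # assumed initial value P(0) = 1 / lam0

    def rhs(kk, pp):
        return -pp / lam(kk, q, lam0)

    for _ in range(steps):
        s1 = rhs(k, p)
        s2 = rhs(k + 0.5 * h, p + 0.5 * h * s1)
        s3 = rhs(k + 0.5 * h, p + 0.5 * h * s2)
        s4 = rhs(k + h, p + h * s3)
        p += h * (s1 + 2.0 * s2 + 2.0 * s3 + s4) / 6.0
        k += h
    return p

def p_q(k, q, lam0):
    """Closed form (1/lam0) * [1 - (1 - q) k / lam0]^(q / (1 - q)), for q != 1."""
    return (1.0 / lam0) * (1.0 - (1.0 - q) * k / lam0) ** (q / (1.0 - q))

q, lam0, k_end = 1.65, 1.91, 20.0
numeric = integrate_evolution(k_end, q, lam0)
closed = p_q(k_end, q, lam0)
print(numeric, closed)
```

Setting `q = 1.0` in `integrate_evolution` makes $\lambda(k)$ constant and reproduces the plain exponential decay, in line with the limit mentioned in the text.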

5 Eq. (13) can be derived by dividing the master equation, $dP(k)/dt = -cP(k)$, by the "growth of network" $dk/dt$. One then gets the evolution equation $dP(k)/dk = -cP(k)\,dt/dk$, which for the linear dependence of the growth of the network assumed here, $dk/dt = a + bk$, leads to eq. (13) with $\lambda(k) = (a + bk)/c$. Our example considered here corresponds to the choice $c = 1/\lambda_0$, $a = 1/q$ and $b = (1 - 1/q)/\lambda_0$. Notice that $1/q$ plays the role of a weight with which we select the constant and linear terms in the equation describing the growth of the network. Notice also that this kind of growth of the network, i.e., its dependence on $k$ (cf. eq. (14)), corresponds to the selective probability $\Pi(k)$, which leads to a power-like distribution.

It must be stressed, however, that the information theory approach leads in a natural way (via maximization of the respective information entropy) only to equilibrium (or stationary) distributions $P_q(k)$, whereas in models describing evolving complex networks [11, 7] the functional form of $P(k)$ is determined by the growth equation $dk/dt$. Introducing a more complicated network evolution than the one presented above when deriving eq. (4), for example allowing for the occurrence of local events in the form of internal edges and rewirings, one gets (see eq. (111) of [7])



$$P(k) \sim \left[k + \kappa(p,r,m)\right]^{-\gamma(p,r,m)}, \qquad (15)$$



where $p$ is the probability that one is adding $m$ new edges to the system, $r$ is the probability that one is rewiring $m$ edges and $1 - p - r$ is the probability that one is adding a new node to the system. The $\kappa$ and $\gamma$ in eq. (15) are given by [7]:

$$\kappa(p,r,m) = A(p,r,m) + 1 \quad\text{and}\quad \gamma(p,r,m) = B(p,r,m) + 1, \qquad (16)$$

where, in turn,

$$A(p,r,m) = (p - r)\left[\frac{2m(1-r)}{1-p-r} + 1\right], \qquad B(p,r,m) = \frac{2m(1-r) + 1 - p - r}{m}. \qquad (17)$$



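One can now formally write eq. (15) in a form resembling eq. (10), i.e., as

$$P(k) \sim \left[\kappa(p,r,m)\right]^{-\gamma(p,r,m)}\left[1 - (1-q)\frac{k}{(q-1)\,\kappa(p,r,m)}\right]^{-\gamma(p,r,m)} \qquad (18)$$

and identify $\gamma(p,r,m) = q/(q-1)$ and $\lambda_0 = (q-1)\,\kappa(p,r,m)$, or

$$q = 1 + \frac{1}{B} \quad\text{and}\quad \lambda_0 = \frac{A + 1}{B}. \qquad (19)$$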



However, it must be noticed that, at least in the example considered here, the limit $q = 1$ cannot be achieved because the quantity $B$ above is finite (for all reasonable values of the parameters [7]). This means that formula (10) is more general and captures (by means of the parameters $q$ and $\lambda_0$) some additional features of complex networks not present in its simple formulation (as given, for example, by eq. (15)).
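The mapping from the model parameters $(p, r, m)$ to the pair $(q, \lambda_0)$ can be written as a short function. This is a sketch under the assumption that $A = (p - r)\,[2m(1-r)/(1-p-r) + 1]$ and $B = [2m(1-r) + 1 - p - r]/m$; in particular the prefactor $(p - r)$ is taken here from the form of the model in [7] and should be checked against that reference. The function name and the sample parameter values are illustrative.

```python
def local_event_parameters(p, r, m):
    """Return (kappa, gamma, q, lam0) for given p, r, m.

    Assumes A = (p - r) * (2 m (1 - r) / (1 - p - r) + 1) and
    B = (2 m (1 - r) + 1 - p - r) / m.
    """
    if not (0.0 <= p and 0.0 <= r and p + r < 1.0):
        raise ValueError("need p >= 0, r >= 0 and p + r < 1")
    a = (p - r) * (2.0 * m * (1.0 - r) / (1.0 - p - r) + 1.0)
    b = (2.0 * m * (1.0 - r) + 1.0 - p - r) / m
    kappa, gamma = a + 1.0, b + 1.0  # kappa = A + 1, gamma = B + 1
    q = 1.0 + 1.0 / b                # q = 1 + 1/B
    lam0 = (a + 1.0) / b             # lam0 = (A + 1)/B
    return kappa, gamma, q, lam0

for p, r, m in [(0.0, 0.0, 1), (0.3, 0.1, 2), (0.2, 0.2, 3)]:
    kappa, gamma, q, lam0 = local_event_parameters(p, r, m)
    print(p, r, m, round(gamma, 3), round(q, 3), round(lam0, 3))
```

Because $B$ stays finite for any admissible $(p, r, m)$, the returned $q$ is always strictly larger than 1, which is the observation made in the text about the unreachable limit $q = 1$.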



To summarize: we have demonstrated that, in order to apply the information theory approach to the analysis of stochastic networks, one has to use the nonextensive Tsallis information entropy $S_q$ [22], leading to the distribution $P_q(k)$ given by eq. (10). As shown in Fig. 1, such a distribution provides a satisfactory description of data on the number of links in random networks in the whole range of the variable $k$ by means of only two parameters: the mean value of $k$ and the parameter $q$ characterizing the type of information entropy to be chosen. In this way one describes such disparate situations as the exponential ER model ^5] (for $q = 1$) and the scale-free, power-like models (with $q = \gamma/(\gamma - 1)$). For the value of the nonextensivity parameter $q = 3/2$, for which the variance of our system is divergent, one obtains the exponent $\gamma = 3$, which seems to be the limiting value observed in analyses of diverse systems displaying complex topology. Although only one example has been shown here in Fig. 1, it is obvious that one can just as easily fit other, similar results discussed in the literature (cf., for example, |Sj) (see footnote 6). The other point is the possible systematics of the parameter $q$ emerging from such a search, but this problem is outside the scope of our presentation.

We conclude by saying that, from the information theory point of view, eq. (10) could be used to fit different data, providing a pair of numbers $(q, \lambda_0)$ for each example. All competing models could then be checked for their ability to reproduce these $(q, \lambda_0)$ correctly, and all models reproducing them correctly should be regarded as equally good from the point of view of the distribution $P(k)$ because, according to the philosophy of the information theory approach, they apparently contain the same amount of information as exists in the data which have been used. To distinguish between them further one would have to use some additional information contained in other network measures such as, for example, the clustering coefficient, the distance between nodes, cycles or graph spectra.
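When both moments exist, a pair $(q, \lambda_0)$ can be read off directly from the measured mean and variance by inverting eq. (11). The sketch below is one possible way to do this, not a fitting procedure taken from the text; it is applicable only for $q < 3/2$, and the function names and test values are illustrative.

```python
def moments(q, lam0):
    """Mean and variance of P_q(k) from eq. (11); requires q < 3/2."""
    if not q < 1.5:
        raise ValueError("Var(k) is finite only for q < 3/2")
    mean = lam0 / (2.0 - q)
    var = lam0 ** 2 / ((3.0 - 2.0 * q) * (2.0 - q) ** 2)
    return mean, var

def parameters_from_moments(mean, var):
    """Invert eq. (11): Var / <k>^2 = 1 / (3 - 2q) and lam0 = <k> (2 - q)."""
    q = 0.5 * (3.0 - mean ** 2 / var)
    return q, mean * (2.0 - q)

mean, var = moments(1.2, 2.0)
print(parameters_from_moments(mean, var))
```

For data with a heavier tail, where the variance does not exist, the pair has to be obtained from a fit to the full distribution instead.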



6 For the most recent application of Tsallis statistics to the investigation of Internet traffic problems see [23]; for a previous one see [18].